Co-Training with Insufficient Views

Authors

  • Wei Wang
  • Zhi-Hua Zhou
Abstract

Co-training is a well-known semi-supervised learning paradigm that exploits unlabeled data with two views. Most previous theoretical analyses of co-training are based on the assumption that each view is sufficient to correctly predict the label. However, this assumption can hardly be met in real applications due to feature corruption or various kinds of feature noise. In this paper, we present a theoretical analysis of co-training when neither view is sufficient. We define the diversity between the two views with respect to the confidence of prediction and prove that if the two views have large diversity, co-training can improve the learning performance by exploiting unlabeled data even with insufficient views. We also discuss the relationship between view insufficiency and diversity, and give some implications for understanding the difference between co-training and co-regularization.
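To make the setting concrete, here is a minimal co-training sketch in Python using scikit-learn. It is not the paper's algorithm or its formal diversity definition: the two "views" are simply an arbitrary split of a synthetic feature set (so each view is plausibly insufficient on its own), confidence is taken as the maximum predicted class probability, and the final disagreement rate is only an illustrative proxy for diversity with respect to prediction confidence.

    # Minimal co-training sketch (illustrative only; not the paper's algorithm).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    rng = np.random.RandomState(0)
    X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                               random_state=0)

    # Two "views": an arbitrary split of the features, each likely insufficient alone.
    view1, view2 = X[:, :10], X[:, 10:]

    # A small labeled pool; everything else is treated as unlabeled.
    labeled = rng.choice(len(y), size=30, replace=False)
    unlabeled = np.setdiff1d(np.arange(len(y)), labeled)
    idx1, idx2 = list(labeled), list(labeled)        # training indices per learner
    y1, y2 = list(y[labeled]), list(y[labeled])      # true or pseudo labels

    for _ in range(10):                              # co-training rounds
        h1 = LogisticRegression(max_iter=1000).fit(view1[idx1], y1)
        h2 = LogisticRegression(max_iter=1000).fit(view2[idx2], y2)

        # Confidence = maximum predicted class probability on unlabeled data.
        conf1 = h1.predict_proba(view1[unlabeled]).max(axis=1)
        conf2 = h2.predict_proba(view2[unlabeled]).max(axis=1)
        pick1 = unlabeled[np.argsort(-conf1)[:5]]    # h1's most confident examples
        pick2 = unlabeled[np.argsort(-conf2)[:5]]    # h2's most confident examples

        # Each learner teaches the other with its confidently pseudo-labeled examples.
        idx2 += list(pick1)
        y2 += list(h1.predict(view1[pick1]))
        idx1 += list(pick2)
        y1 += list(h2.predict(view2[pick2]))
        unlabeled = np.setdiff1d(unlabeled, np.concatenate([pick1, pick2]))

    # Crude proxy for view diversity: how often the two learners disagree
    # on the remaining unlabeled data.
    d1 = h1.predict(view1[unlabeled])
    d2 = h2.predict(view2[unlabeled])
    print("disagreement rate:", np.mean(d1 != d2))

In the paper's terms, the interesting regime is when neither view suffices to predict the label on its own, yet the two learners are confident on different unlabeled examples; the abstract's result says that such diversity is what allows co-training to still benefit from unlabeled data.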


Similar papers

Investigation of Co-training Views and Variations for Semantic Role Labeling

Co-training, as a semi-supervised learning method, has recently been applied to semantic role labeling to reduce the need for costly annotated data by using unannotated data. A main concern in co-training is how to split the problem into multiple views from which learning features are derived, so that the views can effectively train each other. We investigate various feature splits based on two SRL views, constitue...

Multi-class Co-training Learning for Object and Scene Recognition

It is often tedious and expensive to label large training data sets for learning-based object and scene recognition systems. This problem could be alleviated by semi-supervised learning techniques, which can automatically select more training samples from unlabeled data to reduce the cost of labeling. In this paper, we propose a multi-class co-training learning method of two different views f...

Analyzing Co-training Style Algorithms

Co-training is a semi-supervised learning paradigm which trains two learners on two different views and lets the learners label some unlabeled examples for each other. In this paper, we present a new PAC analysis of co-training style algorithms. We show that the co-training process can succeed even without two views, given that the two learners have a large difference, which explai...

Co-training for Statistical Machine Translation

I propose a novel co-training method for statistical machine translation. As co-training requires multiple learners trained on views of the data which are disjoint and sufficient for the labeling task, I use multiple source documents as views on translation. Co-training for statistical machine translation is therefore a type of multi-source translation. Unlike previous multi-source methods, it ...

Learning with Weak Views Based on Dependence Maximization Dimensionality Reduction

A large number of applications involving multiple views of data are coming into use, e.g., reporting news on the Internet with both text and video, or identifying a person by both fingerprints and face images. Meanwhile, labeling these data requires expensive effort, and thus most data are left unlabeled in many applications. Co-training can exploit the information of unlabeled data in multi-view sc...

Publication date: 2013